Combining Stacking With Bagging To Improve A Learning Algorithm
Authors
Abstract
In bagging [Bre94a] one uses bootstrap replicates of the training set [Efr79, ET93] to improve a learning algorithm's performance, often by tens of percent. This paper presents several ways that stacking [Wol92b, Bre92] can be used in concert with the bootstrap procedure to achieve a further improvement on the performance of bagging for some regression problems. In particular, in some of the work presented here, one first converts a single underlying learning algorithm into several learning algorithms. This is done by bootstrap resampling the training set, exactly as in bagging. The resultant algorithms are then combined via stacking. This procedure can be viewed as a variant of bagging in which stacking rather than uniform averaging is used to do the combining. The stacking improved performance over simple bagging by up to a factor of 2 on the tested problems, and never resulted in worse performance than simple bagging. In other work presented here, there is no step of converting the underlying learning algorithm into multiple algorithms, so it is the improve-a-single-algorithm variant of stacking that is relevant. The precise version of this scheme tested can be viewed as using the bootstrap and stacking to estimate the input-dependence of the statistical bias and then correct for it. The results are preliminary, but again indicate that combining stacking with the bootstrap can be helpful.
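The paper itself contains no code, but the stacked-bagging variant described above is straightforward to sketch. The Python fragment below is illustrative only: the function names, the decision-tree base learner, and the choice of a linear level-1 combiner are our assumptions, not taken from the paper. It bootstrap-resamples the training set exactly as in bagging, then learns combining weights from out-of-bag predictions rather than averaging uniformly.

```python
# Sketch of "bagging + stacking" for regression: bootstrap-resample the
# training set to get several copies of one learning algorithm, then
# combine them with a learned (stacked) weighting instead of a uniform average.
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def stacked_bagging(base, X, y, n_boot=25):
    """Fit n_boot bootstrap copies of `base`, then learn combining weights
    by regressing y on the members' out-of-bag (OOB) predictions."""
    m = len(y)
    members = []
    oob_preds = np.full((n_boot, m), np.nan)
    for b in range(n_boot):
        idx = rng.integers(0, m, size=m)              # bootstrap sample
        oob = np.setdiff1d(np.arange(m), idx)         # points the sample missed
        model = clone(base).fit(X[idx], y[idx])
        members.append(model)
        oob_preds[b, oob] = model.predict(X[oob])
    # Level-1 data: fill each member's non-OOB slots with its mean OOB
    # prediction (a simplification so every training point can be used).
    fill = np.nanmean(oob_preds, axis=1, keepdims=True)
    level1 = np.where(np.isnan(oob_preds), fill, oob_preds).T
    combiner = LinearRegression().fit(level1, y)      # the stacking step
    return members, combiner

def predict(members, combiner, X):
    return combiner.predict(np.column_stack([m.predict(X) for m in members]))

# Illustrative run on synthetic data.
X = rng.normal(size=(200, 5))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)
members, comb = stacked_bagging(DecisionTreeRegressor(max_depth=4), X, y)
print(predict(members, comb, X[:5]))
```

Replacing the learned combiner with a uniform average over `members` recovers plain bagging, which is exactly the comparison the abstract reports.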
Related papers
Stacking Bagged and Dagged Models
In this paper, we investigate the method of stacked generalization in combining models derived from different subsets of a training dataset by a single learning algorithm, as well as different algorithms. The simplest way to combine predictions from competing models is majority vote, and the effect of the sampling regime used to generate training subsets has already been studied in this context; wh...
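For reference, the majority-vote baseline this snippet mentions takes only a few lines; the sketch below assumes integer class labels arranged as a (models × samples) array, which is our convention rather than anything from the paper.

```python
# Majority vote over competing classifiers: the simplest combiner,
# shown for integer class labels.
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_samples) array of integer labels.
    Returns the most frequent label per sample (ties break toward
    the lowest label)."""
    counts = np.apply_along_axis(
        np.bincount, 0, predictions, minlength=predictions.max() + 1)
    return counts.argmax(axis=0)

votes = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [1, 1, 0]])   # three models, three samples
print(majority_vote(votes))    # -> [0 1 0]
```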
A comparison of stacking with MDTs to bagging, boosting, and other stacking methods
In this paper, we present an integration of the algorithm MLC4.5 for learning meta decision trees (MDTs) into the Weka data mining suite. MDTs are a method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. The algorithm is based on the C4.5 algorithm for learning ordinary decision trees. An extensive pe...
An Efficient Method to Estimate Bagging's Generalization Error
Bagging [1] is a technique that tries to improve a learning algorithm's performance by using bootstrap replicates of the training set [5, 4]. The computational requirements for estimating the resultant generalization error on a test set by means of cross-validation are often prohibitive: for leave-one-out cross-validation one needs to train the underlying algorithm on the order of m times, where...
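The snippet is cut off, but the efficiency argument points at reusing the points each bootstrap replicate leaves out ("out-of-bag" points) to score generalization error without any extra training runs. The sketch below follows that idea under our own assumptions (squared-error loss, the `oob_error` name, a decision-tree base learner); it is not code from the paper.

```python
# Out-of-bag estimate of bagging's generalization error: each training point
# is scored only by ensemble members whose bootstrap sample omitted it, so
# the n_boot fits needed for bagging anyway are the only training runs.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

def oob_error(base, X, y, n_boot=50):
    m = len(y)
    pred_sum = np.zeros(m)
    pred_cnt = np.zeros(m)
    for _ in range(n_boot):
        idx = rng.integers(0, m, size=m)          # bootstrap sample
        oob = np.setdiff1d(np.arange(m), idx)     # left-out points
        model = clone(base).fit(X[idx], y[idx])
        pred_sum[oob] += model.predict(X[oob])
        pred_cnt[oob] += 1
    seen = pred_cnt > 0                           # OOB at least once
    oob_pred = pred_sum[seen] / pred_cnt[seen]
    return np.mean((oob_pred - y[seen]) ** 2)     # squared-error estimate

# Illustrative run on synthetic data.
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.2, size=300)
print(oob_error(DecisionTreeRegressor(max_depth=5), X, y))
```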
Application of Bagging, Boosting and Stacking to Intrusion Detection
This paper investigates the possibility of using ensemble algorithms to improve the performance of network intrusion detection systems. We use an ensemble of three different methods, bagging, boosting and stacking, in order to improve the accuracy and reduce the false positive rate. We use four different data mining algorithms: naïve Bayes, J48 (decision tree), JRip (rule induction) and IBk (ne...
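The snippet names Weka learners; the scikit-learn sketch below substitutes rough counterparts (GaussianNB for naïve Bayes, DecisionTreeClassifier for J48, KNeighborsClassifier for IBk) under a logistic-regression combiner. JRip has no direct scikit-learn analogue and is omitted; all parameters are illustrative.

```python
# A heterogeneous stacking ensemble in the spirit of the snippet,
# built from scikit-learn stand-ins for the Weka algorithms it lists.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

stack = StackingClassifier(
    estimators=[
        ("nb", GaussianNB()),                          # ~ naïve Bayes
        ("tree", DecisionTreeClassifier()),            # ~ J48
        ("knn", KNeighborsClassifier(n_neighbors=5)),  # ~ IBk
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,   # internal cross-validation builds the level-1 training data
)

# Illustrative fit on synthetic data (not an intrusion-detection dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
stack.fit(X, y)
print(stack.score(X, y))   # training accuracy, for illustration only
```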
Why Does Bagging Work? A Bayesian Account and its Implications
The error rate of decision-tree and other classification learners can often be much reduced by bagging: learning multiple models from bootstrap samples of the database, and combining them by uniform voting. In this paper we empirically test two alternative explanations for this, both based on Bayesian learning theory: (1) bagging works because it is an approximation to the optimal procedure of B...